Use half floats for most floating point numbers #1941
Merged
ruflin merged 1 commit into elastic:master on Jul 4, 2016
Starting with 5.0.0-alpha4, Elasticsearch supports half floats. [Half precision floating-point](https://en.wikipedia.org/wiki/Half-precision_floating-point_format) numbers have good precision for small numbers, but the precision degrades quickly for larger numbers. This PR switches to half floats all fields that are percentages (values between 0 and 1) and all fields that were floats but only ever hold naturally small values. After going through the list, most of our floating point numbers fit into one of these categories. Only 3 float fields were left untouched. The template generator automatically switches to "float" when generating the ES 2.x template files.
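To make the precision trade-off concrete, here is a minimal sketch that rounds a few values to IEEE 754 half precision and prints the resulting error. It uses Python's `struct` module (format code `e`) purely for illustration; the helper name `to_half` and the sample values are assumptions and are not taken from the Beats templates or from Elasticsearch.

```python
import struct


def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value, returned as a Python float."""
    # Packing with format "e" stores the value in 2 bytes (half precision);
    # unpacking reads back the value that was actually representable.
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Hypothetical sample values: two percentage-like values and three larger ones.
for value in (0.1234, 0.9999, 12.345, 1234.5, 12345.6):
    stored = to_half(value)
    print(f"{value:>10} stored as {stored:<12} abs error {abs(stored - value):.6f}")
```

Values between 0 and 1 come back with an error well below 0.001, while values in the thousands can be off by a whole unit or more, which is why only percentage fields and naturally small fields are candidates.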
Contributor
Author
jenkins, retest it
Contributor
Thanks a lot for taking this one.
Contributor
I'm curious whether this significantly reduces the size of indices.
Contributor
See also elastic/elasticsearch#19264.
jpountz added a commit to jpountz/elasticsearch that referenced this pull request on Jul 18, 2016
This is an attempt to revive elastic#15939, motivated by elastic/beats#1941. Half-floats are a pretty bad option for storing percentages: they would likely require 2 bytes all the time, while percentages don't need more than one byte. So this PR exposes a new `scaled_float` type that requires a `scaling_factor` and internally indexes `value*scaling_factor` in a long field. Compared to the original PR, it exposes a lower-level API so that the trade-offs are clearer, and it avoids any reference to fixed precision that might imply that this type is more accurate (actually it is *less* accurate). In addition to being more space-efficient for some use-cases that Beats is interested in, this is also faster than `half_float` unless we can improve the efficiency of decoding half-float bits (which is currently done in software) or until Java gets first-class support for half-floats.
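The idea behind `scaled_float` can be sketched in a few lines. This is only an illustration of storing `value*scaling_factor` as an integer, not the Elasticsearch implementation: the function names are hypothetical, and rounding to the nearest integer with Python's `round` is an assumption about how the fractional part is handled.

```python
def encode_scaled(value: float, scaling_factor: float) -> int:
    """Return the integer that would be stored for value (assumed: round to nearest)."""
    return round(value * scaling_factor)


def decode_scaled(stored: int, scaling_factor: float) -> float:
    """Recover the approximate original value from the stored integer."""
    return stored / scaling_factor


# Hypothetical percentage value with a scaling factor of 100 (two decimal places).
stored = encode_scaled(0.4237, 100)
print(stored, decode_scaled(stored, 100))
```

With a scaling factor of 100, every percentage between 0 and 1 maps to an integer between 0 and 100, which fits in a single byte. Digits beyond the chosen scale are lost, which matches the comment's point that the type is less accurate, not more.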
Starting with 5.0.0-alpha4, Elasticsearch supports half floats.
Half precision floating-point numbers
have good precision for small numbers, but the precision degrades fast for larger numbers.
This switches to half floats all the fields that are percentages
(values between 0 and 1) and the fields that were floats but have naturally
small values only. After going through the list, most of our floating point
numbers fit in one of these categories. Only 3 float fields were left untouched.
The template generator automatically switches to "float" when generating the
ES 2.x template files.
This closes #1936.